1. INTRODUCTION

The probability mass function of the Borel-Tanner distribution is

$$p(x|\theta,r) = a_r(x)\,\theta^{x-r}e^{-\theta x} \qquad (x = r, r+1, \ldots) \tag{1}$$

where $0 < \theta < 1$, $r$ is a positive integer and $a_r(x) = r x^{x-r-1}/(x-r)!$.
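As a quick numerical sanity check of (1), the pmf can be evaluated on the log scale. The sketch below is illustrative only: the function name `borel_tanner_pmf` and the values $\theta = 0.6$, $r = 5$ are assumptions chosen for the example and do not come from the text.

```python
import math


def borel_tanner_pmf(x: int, theta: float, r: int) -> float:
    """Evaluate p(x | theta, r) = a_r(x) * theta**(x - r) * exp(-theta * x)."""
    if x < r:
        return 0.0
    # log a_r(x) = log r + (x - r - 1) * log x - log (x - r)!
    log_a = math.log(r) + (x - r - 1) * math.log(x) - math.lgamma(x - r + 1)
    return math.exp(log_a + (x - r) * math.log(theta) - theta * x)


# Hypothetical parameter values, chosen only for illustration.
theta, r = 0.6, 5
support = range(r, 2000)  # the tail beyond 2000 is negligible for theta = 0.6
total_mass = sum(borel_tanner_pmf(x, theta, r) for x in support)
mean = sum(x * borel_tanner_pmf(x, theta, r) for x in support)
print(total_mass, mean)
```

The total mass should be numerically indistinguishable from 1, and the mean should agree with $r/(1-\theta) = 12.5$, the expected total progeny of a subcritical Galton-Watson process started from $r$ ancestors.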

Initially, (1) was derived as the probability distribution of the number of customers served in a queuing system. It also appears in random trees and branching processes. More specifically, it is the distribution of the total progeny in a Galton-Watson process assuming Poisson reproduction; see Aldous [1] for recent applications. Our interest in estimating $\theta$ stems from its role as the reproduction number of an epidemic infection modeled by a branching process; see Farrington et al. [2]. We study nonparametric (with respect to the prior) empirical Bayes (NPEB) estimators for $\theta$. The NPEB estimation procedures rely on the assumption that a prior distribution $G$ exists, although it is unknown. Consider independent copies $(X_1,\theta_1), \ldots, (X_{n+1},\theta_{n+1})$ of $(X,\theta)$, where $\theta$ has distribution $G$ and, conditional on $\theta$, $X$ has the Borel-Tanner distribution given by (1). The "past" data consist of independent observations $x_1, x_2, \ldots, x_n$ obtained with independent realizations $\theta_1, \theta_2, \ldots, \theta_n$ of $\theta$, where the $x_i$ are observable and the $\theta_i$ are not. Denote by $\hat\theta_n(x)$ an empirical Bayes estimator for $\theta$ based on the "past" data and the "present" observation $x_{n+1} = x$. As Maritz and Lwin [5] point out, an advantage of NPEB estimators is the minimal assumptions they place on the class of prior distributions. It turns out that, for the Borel-Tanner distribution, the Bayes rule under LINEX loss depends on the prior through the marginals only. This remarkable fact allows us to construct simple NPEB estimators that estimate the Bayes rule directly, without estimating the prior itself.

Notice that NPEB estimators for $\theta$ under weighted squared-error loss are studied in Yanev [7]. In the next section we use the asymmetric LINEX loss function instead. In Section 3 we prove the estimators' asymptotic optimality. The last section contains numerical results concerning the estimators' performance, measured by their regret risk.



2. EMPIRICAL BAYES ESTIMATION USING LINEX LOSS 



In some applications (e.g., surveillance of infectious diseases) the squared-error loss function seems inappropriate in that it assigns the same loss to overestimates as to equal underestimates. A well-known alternative (see Huang et al. [3] and the references therein) is the LINEX loss function defined, for $-\infty < \gamma < \infty$ and $\gamma \neq 0$, by

$$L_\gamma(\hat\theta, \theta) = e^{\gamma(\hat\theta-\theta)} - \gamma(\hat\theta - \theta) - 1, \tag{2}$$

where $\hat\theta$ is an estimator for $\theta$. The LINEX loss function is convex and asymmetric; for $\gamma > 0$ it increases almost linearly for negative errors and almost exponentially for positive errors. Thus, it penalizes overestimation more severely than underestimation. This is reversed when $\gamma < 0$. For small values of $|\gamma|$ the LINEX loss is close to a multiple of the squared-error loss, since $L_\gamma(\hat\theta,\theta) \approx \gamma^2(\hat\theta-\theta)^2/2$. From now on we assume that $\gamma$ is a positive integer; the case $\gamma < 0$ can be treated similarly.
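The asymmetry of (2) is easy to see numerically. The following minimal sketch evaluates the loss for an overestimate and an underestimate of the same size; the function name `linex_loss` and the numbers used are illustrative assumptions.

```python
import math


def linex_loss(estimate: float, theta: float, gamma: float) -> float:
    """LINEX loss (2): exp(gamma * d) - gamma * d - 1 with d = estimate - theta."""
    if gamma == 0:
        raise ValueError("gamma must be nonzero")
    d = gamma * (estimate - theta)
    return math.expm1(d) - d  # expm1(d) = exp(d) - 1, accurate for small d


# Hypothetical values: the same absolute error of 0.2 on either side of theta = 0.7.
over = linex_loss(0.9, 0.7, gamma=3.0)   # overestimate
under = linex_loss(0.5, 0.7, gamma=3.0)  # underestimate
print(over, under)
```

With $\gamma = 3$ the overestimate incurs the larger loss, in line with the discussion above; swapping the sign of $\gamma$ reverses the ordering.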

Based on a single observation, the maximum likelihood estimator $\hat\theta_{mle}(x)$ for $\theta$ is (e.g., Kumar & Consul [4])

$$\hat\theta_{mle}(x) = \frac{x-r}{x}. \tag{3}$$

Denote by $I_A$ the indicator of the event $A$.

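Theorem 1 Assume LINEX loss with $\gamma > 0$, integer. A NPEB estimator for $\theta$ in (1) is

$$\hat\theta_n(x) = \gamma^{-1}\ln T_n(x)\, I\{T_n(x) \in (1, e^\gamma)\} + \frac{x-r}{x}\, I\{T_n(x) \notin (1, e^\gamma)\}, \tag{4}$$

where

$$T_n(x) = \frac{r+\gamma}{r}\left(\frac{x+\gamma}{x}\right)^{x-r-1}\frac{m_n(x|r)}{m_n(x+\gamma|r+\gamma)}$$

and $m_n(z|y)$ is an estimate for the marginal distribution $m_G(z|y) = \int_0^1 p(z|\theta,y)\,dG(\theta)$.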
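Proof The Bayesian estimator $\theta_G(x)$ under LINEX loss is (e.g., Huang et al. [3])

$$\theta_G(x) = -\gamma^{-1}\ln E_{G|x}e^{-\gamma\theta}, \tag{5}$$

provided that $E_{G|x}e^{-\gamma\theta} < \infty$, where $E_{G|x}(\cdot)$ is the expectation w.r.t. the posterior. Since

$$E_{G|x}e^{-\gamma\theta} = \frac{1}{m_G(x|r)}\int_0^1 e^{-\gamma\theta}a_r(x)\,\theta^{x-r}e^{-\theta x}\,dG(\theta) = \frac{a_r(x)}{a_{r+\gamma}(x+\gamma)}\,\frac{m_G(x+\gamma|r+\gamma)}{m_G(x|r)},$$

we can write the Bayesian estimator $\theta_G(x)$ from (5) as

$$\theta_G(x) = \gamma^{-1}\ln\left\{\frac{r+\gamma}{r}\left(\frac{x+\gamma}{x}\right)^{x-r-1}\frac{m_G(x|r)}{m_G(x+\gamma|r+\gamma)}\right\} = \gamma^{-1}\ln T_G(x), \text{ say.}$$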

Note that $\theta_G(x)$ depends upon the prior through the marginal distribution only. Therefore, estimating the marginals, we can construct a NPEB estimator $\hat\theta_n(x)$ for $\theta$ as given in (4). □
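The algebra behind the closed form of the Bayes rule can be made explicit. The short derivation below is a worked restatement that uses only the definition $a_r(x) = r x^{x-r-1}/(x-r)!$ from (1); writing the ratio as $T_G(x)$ follows the notation of the proof.

```latex
\frac{a_{r+\gamma}(x+\gamma)}{a_r(x)}
  = \frac{(r+\gamma)\,(x+\gamma)^{(x+\gamma)-(r+\gamma)-1}}{(x-r)!}
    \cdot \frac{(x-r)!}{r\,x^{x-r-1}}
  = \frac{r+\gamma}{r}\left(\frac{x+\gamma}{x}\right)^{x-r-1},
\qquad\text{so}\qquad
T_G(x) = \frac{a_{r+\gamma}(x+\gamma)}{a_r(x)}\,
         \frac{m_G(x|r)}{m_G(x+\gamma|r+\gamma)} .
```

The factorials cancel because $x + \gamma$ and $r + \gamma$ differ by the same amount as $x$ and $r$, which is why shifting both arguments by $\gamma$ leaves a ratio free of the prior.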
One possible form of the estimators $m_n(z|y)$ in Theorem 1 can be obtained as follows. In addition to the current $X_{n+1}(r) = x$, let us have observed $n$ independent pairs

$$(X_1(r), X_1(\gamma)),\ (X_2(r), X_2(\gamma)),\ \ldots,\ (X_n(r), X_n(\gamma)), \tag{6}$$

where $X_i(r)$ and $X_i(\gamma)$ are independent and Borel-Tanner distributed with $p(x|\theta_i,r)$ and $p(x|\theta_i,\gamma)$, respectively. It is known (e.g., Kumar & Consul [4]) that $X_i(r) + X_i(\gamma)$ has pmf $p(x|\theta_i, r+\gamma)$. Let $f_n(y|r+\gamma)$ be the number of pairs such that $X_i(r)+X_i(\gamma) = y$ $(i = 1, \ldots, n)$. Consistent estimators for the marginals $m_G(x+\gamma|r+\gamma)$ and $m_G(x|r)$ are the relative frequencies

$$m_n(x+\gamma|r+\gamma) = \frac{f_n(x+\gamma|r+\gamma)}{n+1} \quad\text{and}\quad m_n(x|r) = \frac{1+f_n(x|r)}{n+1}. \tag{7}$$

Let us notice here that a NPEB estimator for $\theta$ under the squared-error loss $L(\hat\theta, \theta) = (\hat\theta - \theta)^2$ is constructed in Yanev [7] as follows

On{x) = i^n{x)T5^^^^^-^ e (0, 1)} + - i (0, 1)}' 

where 

where, as before, $a_r(y) = r y^{y-r-1}/(y-r)!$.



3. ASYMPTOTIC OPTIMALITY 

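The Bayes risk of an estimator $\hat\theta$ can be written as

$$R(G,\hat\theta) = \sum_{x=r}^{\infty}\int_0^1 L(\hat\theta,\theta)\,p(x|\theta,r)\,dG(\theta) = \sum_{x=r}^{\infty}\int_0^1 L(\hat\theta,\theta)\,p(\theta|x)\,d\theta\; m_G(x|r),$$

where $p(\theta|x)$ is the posterior distribution. If $R(G,\hat\theta_n|\mathbf{X}_n)$ is the conditional Bayes risk of the estimator $\hat\theta_n(x)$ given $\mathbf{X}_n = (X_1, \ldots, X_n)$, then $R(G,\hat\theta_n) = E_n\{R(G,\hat\theta_n|\mathbf{X}_n)\}$ is the (unconditional) Bayes risk of $\hat\theta_n$, where the expectation $E_n(\cdot)$ is taken with respect to $\mathbf{X}_n$. The estimator $\hat\theta_n(x)$ is asymptotically optimal for given $G$ if $\lim_{n\to\infty}R(G,\hat\theta_n) = R(G,\theta_G)$. We shall prove the asymptotic optimality of $\hat\theta_n(x)$.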
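First, let us find the minimum Bayes risk $R(G,\theta_G)$ attained by the Bayesian estimator $\theta_G(x)$. Since (5) implies $\exp(\gamma\theta_G(x))\int_0^1 \exp(-\gamma\theta)\,p(\theta|x)\,d\theta = 1$, we have

$$R(G,\theta_G) = \sum_{x=r}^{\infty}\left\{e^{\gamma\theta_G(x)}\int_0^1 e^{-\gamma\theta}p(\theta|x)\,d\theta - \gamma\theta_G(x) + \int_0^1 \gamma\theta\,p(\theta|x)\,d\theta - 1\right\} m_G(x|r)$$

$$\phantom{R(G,\theta_G)} = \sum_{x=r}^{\infty}\left\{\int_0^1 \gamma\theta\,p(\theta|x)\,d\theta - \gamma\theta_G(x)\right\} m_G(x|r).$$

Next, using $\exp(\gamma\theta_G(x))\int_0^1 \exp(-\gamma\theta)\,p(\theta|x)\,d\theta = 1$ again, we obtain

$$R(G,\hat\theta_n) = \sum_{x=r}^{\infty}E_n\left\{e^{\gamma\hat\theta_n(x)}\int_0^1 e^{-\gamma\theta}p(\theta|x)\,d\theta - \gamma\hat\theta_n(x) + \int_0^1 \gamma\theta\,p(\theta|x)\,d\theta - 1\right\} m_G(x|r)$$

$$\phantom{R(G,\hat\theta_n)} = \sum_{x=r}^{\infty}E_n\left\{e^{\gamma(\hat\theta_n(x)-\theta_G(x))} - \gamma\hat\theta_n(x) + \int_0^1 \gamma\theta\,p(\theta|x)\,d\theta - 1\right\} m_G(x|r).$$

Therefore,

$$R(G,\hat\theta_n) - R(G,\theta_G) = \sum_{x=r}^{\infty}E_n\left\{e^{\gamma(\hat\theta_n(x)-\theta_G(x))} - \gamma\big(\hat\theta_n(x)-\theta_G(x)\big) - 1\right\} m_G(x|r). \tag{8}$$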

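Let us truncate the Borel-Tanner distribution (1) at $x = r + N$ as follows

$$p^*(x|\theta,r) = \begin{cases} p(x|\theta,r), & r \le x \le r+N-1,\\[2pt] \sum_{k=r+N}^{\infty} p(k|\theta,r), & x = r+N, \end{cases} \tag{9}$$

where $N$ is a positive integer. Denote the truncated marginal by $m^*_G(x|y) = \int_0^1 p^*(x|\theta,y)\,dG(\theta)$.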
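Similar to the non-truncated case, if $r \le x \le r+N-1$ then the posterior expectation $E^*_{G|x}$ under the truncated model satisfies

$$E^*_{G|x}e^{-\gamma\theta} = \frac{a_r(x)}{a_{r+\gamma}(x+\gamma)}\,\frac{m^*_G(x+\gamma|r+\gamma)}{m^*_G(x|r)} = \frac{a_r(x)}{a_{r+\gamma}(x+\gamma)}\,\frac{m_G(x+\gamma|r+\gamma)}{m_G(x|r)} = \frac{1}{T_G(x)}.$$

If $x = r + N$ then

$$E^*_{G|x}e^{-\gamma\theta} = \frac{1}{m^*_G(r+N|r)}\sum_{k=r+N}^{\infty}\frac{a_r(k)}{a_{r+\gamma}(k+\gamma)}\,m_G(k+\gamma|r+\gamma) = \left(\sum_{k=r+N}^{\infty}\frac{m_G(k|r)}{T_G(k)}\right)\Big/\sum_{k=r+N}^{\infty}m_G(k|r).$$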

Let $T^*_G(x) = T_G(x)$ if $r \le x \le r+N-1$, and $T^*_G(x) = \sum_{k=r+N}^{\infty}m_G(k|r)\big/\sum_{k=r+N}^{\infty}\{m_G(k|r)/T_G(k)\}$ if $x = r+N$. The Bayesian estimator in the truncated case is given by $\theta^*_G(x) = \gamma^{-1}\ln T^*_G(x)$. Let us estimate $m_G(x|y)$ by $m_n(x|y)$ as in Theorem 1 and set $T^*_n(x) = T_n(x)$ if $r \le x \le r+N-1$, and $T^*_n(x) = \sum_{k=r+N}^{\infty}m_n(k|r)\big/\sum_{k=r+N}^{\infty}\{m_n(k|r)/T_n(k)\}$ if $x = r+N$. We construct a NPEB estimator in the truncated case as follows

$$\hat\theta^*_n(x) = \gamma^{-1}\ln T^*_n(x)\, I\{T^*_n(x) \in (1, e^\gamma)\} + \frac{x-r}{x}\, I\{T^*_n(x) \notin (1, e^\gamma)\}.$$

Now, we are in a position to prove the asymptotic optimality of $\hat\theta_n(x)$.

Theorem 2 Assume prior $G$ with finite first moment. If $m_n(z|y)$ is a consistent estimator for $m_G(z|y)$, then the NPEB estimator $\hat\theta_n(x)$ given by (4) is asymptotically optimal, i.e.,

$$\lim_{n\to\infty}R(G,\hat\theta_n) = R(G,\theta_G).$$

Proof Since $\theta_G(x)$ is the Bayesian estimator, we have $R(G,\hat\theta_n) \ge R(G,\theta_G)$ and thus

$$0 \le R(G,\hat\theta_n) - R(G,\theta_G) \le |R(G,\hat\theta_n) - R(G,\hat\theta^*_n)| + |R(G,\hat\theta^*_n) - R(G,\theta^*_G)| + |R(G,\theta^*_G) - R(G,\theta_G)|. \tag{10}$$

To prove the theorem it is sufficient to show that $\lim_{N\to\infty}\limsup_{n\to\infty}$ of the right-hand side of (10) equals zero, where $N$ is from (9). The truncated analog of (8) leads to

$$|R(G,\hat\theta^*_n) - R(G,\theta^*_G)| = \sum_{x=r}^{r+N}E_n\left\{e^{\gamma(\hat\theta^*_n(x)-\theta^*_G(x))} - \gamma\big(\hat\theta^*_n(x)-\theta^*_G(x)\big) - 1\right\} m^*_G(x|r).$$

Since $m_n(z|y)$ is a consistent estimator for $m_G(z|y)$, we have $\lim_{n\to\infty}\hat\theta^*_n(x) = \theta^*_G(x)$ with respect to $P^\infty$, where $P^\infty$ is the product measure induced by $X_1, X_2, \ldots, X_n, \ldots$ Notice that both $\hat\theta^*_n$ and $\theta_G$ are bounded. Indeed, $\hat\theta^*_n$ is bounded by definition and $0 \le \theta_G(x) = -(1/\gamma)\ln E_{G|x}(e^{-\gamma\theta}) \le (1/\gamma)\ln e^{\gamma} = 1$. Therefore, by the Lebesgue dominated convergence theorem we can pass to the limit inside the expectation on the right-hand side above and obtain

$$\lim_{n\to\infty}|R(G,\hat\theta^*_n) - R(G,\theta^*_G)| = 0. \tag{11}$$

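Also, since $p^*(\theta|x,r) = p(\theta|x,r)$ and $m^*_G(x) = m_G(x)$ for $r \le x \le r+N-1$, and $m^*_G(r+N) = \sum_{x=r+N}^{\infty}m_G(x)$, it is not difficult to obtain

$$|R(G,\theta^*_G) - R(G,\theta_G)| = \left|\sum_{x=r+N}^{\infty}\left\{\int_0^1 \gamma\theta\,\big(p^*(\theta|r+N) - p(\theta|x)\big)\,d\theta - \gamma\big(\theta^*_G(r+N) - \theta_G(x)\big)\right\} m_G(x)\right|.$$

Since $|p^*(\theta|r+N) - p(\theta|x)| \le 1$, $|\theta^*_G(r+N) - \theta_G(x)| \le 1$, and $E\theta < \infty$, we have

$$\lim_{N\to\infty}|R(G,\theta^*_G) - R(G,\theta_G)| = 0. \tag{12}$$

Similar to (12) one can prove that $\lim_{N\to\infty}|R(G,\hat\theta_n) - R(G,\hat\theta^*_n)| = 0$. This along with (10)-(12) completes the proof. □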
4. NUMERICAL EXAMPLES



Using the notation introduced before (7) we set

$$T_n^f(x) = \frac{r+\gamma}{r}\left(\frac{x+\gamma}{x}\right)^{x-r-1}\frac{1+f_n(x|r)}{f_n(x+\gamma|r+\gamma)}.$$



Let $A = \{T_n^f(x) \in (1, e^\gamma)\} \cap \{f_n(x+\gamma|r+\gamma) \neq 0\}$ and $A^c$ be its complement. Making use of the relative frequency estimators (7), consider $\hat\theta_n^f(x)$ to be defined by

$$\hat\theta_n^f(x) = \gamma^{-1}\ln T_n^f(x)\, I_A + \frac{x-r}{x}\, I_{A^c}.$$

That is, if $A$ occurs, then we estimate $\theta$ by $\gamma^{-1}\ln T_n^f(x)$; whereas if $A^c$ occurs, we use the MLE (3) for $\theta$ instead.
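The rule just described can be sketched in a few lines of code. This is a minimal illustration, not the author's implementation: the function name `npeb_linex_estimate` is hypothetical, and the sketch assumes that the marginal estimates are raw frequency counts, with the present observation added to the count for $x$ (so the numerator is $1 + f_n(x|r)$) and the common factor $1/(n+1)$ cancelling in the ratio.

```python
import math
from collections import Counter


def npeb_linex_estimate(x, pairs, r, gamma):
    """NPEB estimate of theta at the present observation x.

    pairs holds the past observations (X_i(r), X_i(gamma)).
    Assumption of this sketch: m_n(x|r) is proportional to 1 + f_n(x|r) and
    m_n(x+gamma|r+gamma) is proportional to f_n(x+gamma|r+gamma).
    """
    f_r = Counter(a for a, _ in pairs)        # counts of X_i(r)
    f_sum = Counter(a + b for a, b in pairs)  # counts of X_i(r) + X_i(gamma)
    mle = (x - r) / x                         # fallback estimator (3)
    denom = f_sum[x + gamma]
    if denom == 0:                            # event A fails: ratio undefined
        return mle
    t = (r + gamma) / r * ((x + gamma) / x) ** (x - r - 1) * (1 + f_r[x]) / denom
    if 1.0 < t < math.exp(gamma):             # event A occurs
        return math.log(t) / gamma
    return mle


# Tiny made-up data set, for illustration only (r = 5, gamma = 3).
example_pairs = [(6, 3), (6, 3), (5, 4), (7, 3)]
print(npeb_linex_estimate(6, example_pairs, r=5, gamma=3))
print(npeb_linex_estimate(8, example_pairs, r=5, gamma=3))
```

For $x = 6$ the ratio equals $1.6$, which lies in $(1, e^3)$, so the estimate is $\ln(1.6)/3$; for $x = 8$ no pair sums to $11$, so the sketch falls back to the MLE $3/8$.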

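A popular measure of the performance of an estimator $\hat\theta(x)$ is its regret risk $S(\hat\theta) = R(G,\hat\theta) - R(G,\theta_G) \ge 0$. For our simulation study we take $r = 5$, a Uniform$(0.5, 1)$ prior and LINEX loss with $\gamma = 3$. Then the minimum Bayes risk attained by the Bayesian estimator

$$\theta_U(x) = -\frac{1}{3}\ln\left\{\frac{\int_{0.5}^{1}\theta^{x-5}e^{-(x+3)\theta}\,d\theta}{\int_{0.5}^{1}\theta^{x-5}e^{-x\theta}\,d\theta}\right\}$$

is $R(U_{(0.5,1)},\theta_U) = 0.0622$.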
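The Bayes rule under the uniform prior follows from (5) by taking the ratio of two one-dimensional integrals, so it is straightforward to evaluate numerically. The sketch below uses a hand-written composite Simpson rule; the helper names `simpson` and `theta_u` and the number of subintervals are assumptions made for the illustration.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3


def theta_u(x, r=5, gamma=3, lo=0.5, hi=1.0):
    """Bayes estimator under LINEX loss for a Uniform(lo, hi) prior."""
    num = simpson(lambda t: t ** (x - r) * math.exp(-(x + gamma) * t), lo, hi)
    den = simpson(lambda t: t ** (x - r) * math.exp(-x * t), lo, hi)
    return -math.log(num / den) / gamma


for x in range(5, 21):
    print(x, round(theta_u(x), 2))
```

For $x = 5$ both integrals have closed forms and the value is approximately $0.634$, which agrees with the first entry of the $\theta_U(x)$ row in Table 1.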

In the empirical Bayes scheme (6), let us set $n = 50$. Selecting 50 random values $\theta_i \sim U_{(0.5,1)}$, $i = 1, 2, \ldots, 50$, we generate two sets of 50 branching processes starting with $r = 5$ and $\gamma = 3$ ancestors, respectively, both having Poisson$(\theta_i)$, $i = 1, 2, \ldots, 50$, offspring distributions. Notice that the total progeny of each process is a realization of a Borel-Tanner$(\theta_i, \cdot)$ random variable. Repeating the above procedure 100 times, we obtain 100 samples of 50 pairs of Borel-Tanner observations, $(X_i(5), X_i(3))$, $i = 1, 2, \ldots, 50$. Each sample gives us a NPEB estimate $\hat\theta^f_{50}(x)$ with regret risk $S_i(\hat\theta^f_{50})$, $i = 1, 2, \ldots, 100$. We estimate the regret risk $S(\hat\theta^f_{50})$ with the average $\bar S(\hat\theta^f_{50}) = \sum_{i=1}^{100}S_i(\hat\theta^f_{50})/100$.
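The data-generating step of this scheme can be sketched as follows. The code simulates each Galton-Watson process individual by individual and records its total progeny; the helper names, the Poisson sampler (Knuth's multiplication method) and the seed are assumptions of this illustration rather than details taken from the study.

```python
import math
import random


def poisson(lam: float, rng: random.Random) -> int:
    """Draw a Poisson(lam) variate by Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def total_progeny(theta: float, ancestors: int, rng: random.Random) -> int:
    """Total progeny (ancestors included) of a Galton-Watson process with Poisson(theta) offspring."""
    total = pending = ancestors
    while pending > 0:
        children = poisson(theta, rng)
        pending += children - 1  # one individual reproduces and is removed
        total += children
    return total


def simulate_pairs(n: int, r: int, gamma: int, rng: random.Random):
    """One sample of n pairs (X_i(r), X_i(gamma)) with theta_i ~ Uniform(0.5, 1)."""
    thetas = [rng.uniform(0.5, 1.0) for _ in range(n)]
    return [(total_progeny(t, r, rng), total_progeny(t, gamma, rng)) for t in thetas]


rng = random.Random(2024)  # arbitrary seed, for reproducibility only
pairs = simulate_pairs(n=50, r=5, gamma=3, rng=rng)
print(pairs[:5])
```

Each pair shares one $\theta_i$ but uses independent processes, matching the independence assumed in (6). Because $\theta_i$ can be close to 1, an occasional process may run for a long time before dying out.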

The above scheme is repeated with $n = 75$ and $n = 100$. As an illustration, we present in Table 1 results for one sample with $n = 100$. For this particular sample, $S_1(\hat\theta^f_{100}) = 0.0980$, which is less than $S(\hat\theta_{mle}) = 0.1327$.



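| $x$ | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $\hat\theta^f_{100}(x)$ | .46 | .69 | .92 | .65 | .58 | .51 | .55 | .53 | .62 | .96 | .61 | .69 | .16 | .72 | .79 | .75 |
| $\theta_U(x)$ | .63 | .64 | .65 | .65 | .66 | .67 | .67 | .68 | .69 | .69 | .70 | .71 | .71 | .72 | .73 | .73 |
| $\hat\theta_{mle}(x)$ | 0 | .16 | .28 | .38 | .44 | .50 | .55 | .58 | .62 | .64 | .67 | .69 | .71 | .72 | .74 | .75 |

Table 1: Estimates $\hat\theta^f_{100}(x)$, $\theta_U(x)$, and $\hat\theta_{mle}(x)$ for $\theta$ from a sample with $n = 100$.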
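| $n$ | $\bar S(\hat\theta^f_n)$, $5 \le x \le 15$ | $\mathrm{STD}(S_i(\hat\theta^f_n))$, $5 \le x \le 15$ | $S(\hat\theta_{mle})$, $5 \le x \le 15$ | $\bar S(\hat\theta^f_n)$, $5 \le x \le 200$ | $\mathrm{STD}(S_i(\hat\theta^f_n))$, $5 \le x \le 200$ | $S(\hat\theta_{mle})$, $5 \le x \le 200$ |
|---|---|---|---|---|---|---|
| 50 | 0.1211 | 0.0037 | 0.1292 | 0.1397 | 0.0037 | 0.1327 |
| 75 | 0.1076 | 0.0036 | 0.1292 | 0.1300 | 0.0037 | 0.1327 |
| 100 | 0.1038 | 0.0033 | 0.1292 | 0.1299 | 0.0036 | 0.1327 |

Table 2: Numerical results on regret risks of $\hat\theta_n^f(x)$ and $\hat\theta_{mle}(x)$.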



The numerical results for the regret risks are given in Table 2. Several comments are in order. For small $x$ (columns 2-4) and $n = 75$ or $100$, the improvement of $\hat\theta_n^f$ over $\hat\theta_{mle}$ is substantial. Overall (columns 5-7), the regret risk of $\hat\theta_n^f$ is not higher than that of $\hat\theta_{mle}$.

Finally, note that the Borel-Tanner distribution (1) has monotone likelihood ratio in $x$, i.e., $p(x|\theta',r)/p(x|\theta,r)$ is an increasing function of $x$ whenever $0 < \theta < \theta' < 1$. This suggests that the NPEB estimator $\hat\theta_n(x)$ can be improved on by the monotonizing procedure of Van Houwelingen and Stijnen [6].
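The monotone likelihood ratio property can be verified directly from (1). The following short derivation is a worked check added for illustration; it uses only the form of the pmf and the fact that $t \mapsto \ln t - t$ is strictly increasing on $(0, 1)$.

```latex
\frac{p(x|\theta',r)}{p(x|\theta,r)}
  = \left(\frac{\theta'}{\theta}\right)^{x-r} e^{-(\theta'-\theta)x}
  = \left(\frac{\theta}{\theta'}\right)^{r}
    \exp\Big\{ x\,\big[(\ln\theta' - \theta') - (\ln\theta - \theta)\big] \Big\},
\qquad
\frac{d}{dt}\,(\ln t - t) = \frac{1}{t} - 1 > 0 \quad (0 < t < 1).
```

Since $0 < \theta < \theta' < 1$, the bracketed coefficient of $x$ is positive, so the ratio is increasing in $x$.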